Introduction

Over the past few years, Major League Baseball has seen substantial shifts in which pitches are considered the most dominant. Earlier this year, Mets ace Justin Verlander remarked that sliders haven’t been performing as well as in the past; in fact, OPS on sliders is “the highest it’s been in the pitch tracking years” [1]. The introduction of the sweeper, a classification meant to distinguish between two types of sliders, and new evaluation metrics such as Stuff+, Location+, and Pitching+ have added dimension to the evaluation of specific pitch types and indicate a growing focus on individual pitch characteristics. While the discrete qualities of a pitch (such as movement, velocity, and spin rate) are undoubtedly useful in evaluating and explaining elite pitchers, the effects that sequencing and in-game pitching strategy have on outcomes cannot be overstated. Integrating inter-pitch dynamics with discrete pitch characteristics might therefore model a pitcher’s effectiveness more accurately. FanGraphs’ Stuff+ metric does a satisfactory job of evaluating the efficacy of a pitch based on discrete characteristics, but it does not consider the art of pitch sequencing. Our motivation for this project is both to assess current discrete pitch evaluation metrics and to consider how inter-pitch dynamics can enhance existing models.



Data

The data for our project were compiled from FanGraphs and Baseball Savant. All of the data used in our research are season statistics separated by year, pitcher, and pitch type from the 2020-2022 seasons, restricted to cases where the specified pitch was thrown 250 or more times in the given season. The Stuff+ data, both pitcher-level and pitch-specific values, are from FanGraphs. All other data, including movement and pitch metric data (e.g., spin rate, xwOBA, pitches thrown), were collected from Baseball Savant.
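As a rough illustration of the inclusion rule described above, the following sketch filters pitcher-season-pitch rows to the 2020-2022 seasons with at least 250 pitches. The column names (`year`, `pitcher`, `pitch_type`, `pitches`) and the example rows are hypothetical; they are not the actual FanGraphs or Baseball Savant export schema.

```python
# Hypothetical rows: one record per pitcher, season, and pitch type.
# Column names and values are assumptions made for illustration only.
rows = [
    {"year": 2021, "pitcher": "Pitcher A", "pitch_type": "Slider", "pitches": 612},
    {"year": 2021, "pitcher": "Pitcher A", "pitch_type": "Curveball", "pitches": 140},
    {"year": 2019, "pitcher": "Pitcher B", "pitch_type": "Sinker", "pitches": 900},
    {"year": 2022, "pitcher": "Pitcher C", "pitch_type": "Cutter", "pitches": 250},
]

MIN_PITCHES = 250            # "250 or more recorded pitches"
SEASONS = range(2020, 2023)  # 2020, 2021, 2022


def keep_row(row):
    """Return True if a pitcher-season-pitch row meets the inclusion rule."""
    return row["year"] in SEASONS and row["pitches"] >= MIN_PITCHES


filtered = [row for row in rows if keep_row(row)]

for row in filtered:
    print(row["year"], row["pitcher"], row["pitch_type"], row["pitches"])
```

Note that the threshold is inclusive, so a pitch thrown exactly 250 times in a season is kept.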

Exploratory Data Analysis (EDA)

Note on recategorization:
“Sliders” includes Slurves and Sweepers
“Curveballs” includes Knuckle Curves
“Changeups” includes Splitters
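The recategorization above can be sketched as a simple lookup. The label strings used here are assumptions about how the raw pitch types might be spelled; any label not listed is passed through unchanged.

```python
# Mapping from a raw pitch label to the broader category used in the analysis.
# The exact label spellings are hypothetical.
PITCH_GROUPS = {
    "Slurve": "Slider",
    "Sweeper": "Slider",
    "Knuckle Curve": "Curveball",
    "Splitter": "Changeup",
}


def recategorize(pitch_type):
    """Return the grouped category for a pitch label, or the label itself."""
    return PITCH_GROUPS.get(pitch_type, pitch_type)


for label in ["Sweeper", "Knuckle Curve", "Splitter", "4-Seamer"]:
    print(label, "->", recategorize(label))
```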

Our initial exploratory data analysis revolved around understanding pitch characteristics such as movement, velocity, and spin. We began by charting average values by pitch type: horizontal and vertical break, the proportion of pitches thrown, spin rate in revolutions per minute, and speed.


Pitch       Horizontal Break   Vertical Break   Pitch Proportion   Spin Rate (rpm)   Speed (mph)
4-Seamer    7.45               14.86            0.46               2285.29           93.93
Sinker      15.00              22.89            0.39               2127.16           93.21
Cutter      2.88               25.97            0.34               2380.57           88.97
Splitter    11.71              33.09            0.28               1459.77           86.37
Slider      6.42               36.28            0.34               2432.44           84.94
Changeup    14.03              32.27            0.26               1754.87           84.59
Curveball   9.45               53.35            0.26               2572.18           79.64
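Averages of this kind can be produced by grouping rows by pitch type and taking the mean of each metric. The sketch below uses made-up rows and hypothetical column names (`pitch_type`, `speed_mph`, `spin_rpm`); it does not reproduce the values in the table.

```python
from collections import defaultdict
from statistics import mean

# Made-up example rows; these are not the project's data.
rows = [
    {"pitch_type": "Slider", "speed_mph": 85.0, "spin_rpm": 2400.0},
    {"pitch_type": "Slider", "speed_mph": 87.0, "spin_rpm": 2500.0},
    {"pitch_type": "Curveball", "speed_mph": 79.0, "spin_rpm": 2600.0},
]


def average_by_pitch(rows, metric):
    """Return {pitch_type: mean of `metric`} over the given rows."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["pitch_type"]].append(row[metric])
    return {pitch: mean(values) for pitch, values in grouped.items()}


avg_speed = average_by_pitch(rows, "speed_mph")
avg_spin = average_by_pitch(rows, "spin_rpm")
print(avg_speed)
print(avg_spin)
```

This is an unweighted mean over rows; whether the averages should instead be weighted by pitches thrown is a design choice the text does not specify.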

Next, we turned our attention to Stuff+, a popular metric used to evaluate pitch quality. Stuff+ factors in the discrete characteristics we’ve explored, among others, to evaluate pitchers. Our question was: do pitchers know what they do well, and can Stuff+ actually predict usage? In other words, we wanted to see whether pitchers “knew” which of their pitches were most effective and threw those pitches in higher proportions.

Our graph shows that as Stuff+ increases, pitch usage tends to increase as well, indicating that pitchers do in fact throw their best pitches (as determined by Stuff+) more frequently. The graph is colored by weighted on-base average (wOBA) to further show that high Stuff+ ratings are associated with lower opposing wOBA.
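One simple way to quantify a relationship like the one in the graph is a Pearson correlation between Stuff+ and usage proportion. The sketch below is illustrative only: the numbers are invented toy values, not our data, and the text does not state that this statistic was computed.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented toy values: Stuff+ ratings and the share of pitches thrown.
stuff_plus = [85, 95, 100, 110, 125]
usage = [0.10, 0.18, 0.22, 0.30, 0.41]

r = pearson_r(stuff_plus, usage)
print(f"Pearson r = {r:.3f}")
```

A positive coefficient would be consistent with higher-rated pitches being thrown more often, though correlation alone does not show that pitchers choose usage because of pitch quality.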

Pitch-by-Pitch Breakdowns:

Next, we wanted to know how the importance of each discrete characteristic varies by pitch type. Is vertical break more important than speed for curveballs, and if so, how much more important is it? To explore these questions, we created variable importance plots for each of our variables of interest across all pitch types. Our outcome variables (wOBA, xwOBA, and run value per 100 pitches) were selected because they are tangible, result-based metrics that point toward the effectiveness of a pitch.
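The text does not say which importance measure underlies the plots. As one common, model-agnostic option, the sketch below computes permutation importance: the increase in mean squared error when a single feature column is shuffled. The model, the two features, and the data are synthetic stand-ins, not the project's model or data.

```python
import random


def mse(model, X, y):
    """Mean squared error of `model` (a function of one row) on X, y."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Average increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data: two stand-in features, the first far more influential.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]


def toy_model(row):
    """Stand-in for an already-fitted model (hypothetical coefficients)."""
    return 3.0 * row[0] + 0.5 * row[1]


y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
for name, value in zip(["feature_0", "feature_1"], importances):
    print(name, round(value, 3))
```

Because the scores are measured in units of the outcome's squared error, they are only meaningful relative to other features in the same model and outcome.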

Intra-Pitch Data Analysis

Variable Importance Plots

Sinkers, Cutters, and Four-Seam Fastballs have been aggregated into a single “Fastball” category.

[Figures: variable importance plots for each outcome (wOBA, xwOBA, Run Value / 100), with one panel each for Fastballs, Sliders, Curveballs, and Change-Ups.]

Importance values should be compared only within a single plot, not across plots. Within a plot, we can see which variables have similar importance for a particular pitch type; moving from plot to plot shows how the ranking of variables changes with the pitch type and the outcome variable.

Pitch Characteristics

[Figures: pitch characteristic plots for each outcome (wOBA, xwOBA, Run Value / 100), with one panel each for Four-Seam, Cutters, Sinkers, Sliders, Curveballs, and Change-Ups.]

explain pitch characteristics

Pitch Movement

[Figures: pitch movement plots for each outcome (wOBA, xwOBA, Run Value / 100), with one panel each for Four-Seam, Cutters, Sinkers, Sliders, Curveballs, and Change-Ups.]

explain pitch movement

Intra-Pitch xwOBA GAM

[Figures: intra-pitch xwOBA GAM results, with one panel each for Four-Seam, Cutters, Sinkers, Sliders, Curveballs, and Change-Ups.]

Inter-Pitch Data

Inter-Pitch Data Analysis

Inter-Pitch xwOBA Models

[Figures: inter-pitch xwOBA model results for the GAM and the Random Forest, each with one panel for Four-Seams, Cutters, Sinkers, Sliders, Curveballs, and Change-Ups.]

Looking Ahead

We are going to have lots of fun moving forward. That’s all <3

Acknowledgements

We would like to thank Meg Ellingwood and Shamindra Shrotriya, the leaders of the Carnegie Mellon Summer Undergraduate Research Experience in Statistics, for their invaluable knowledge, guidance, and instruction throughout this research experience. This project would not have been possible without the help of Dr. Ron Yurko and Sean Ahmed, Pirates Director of R&D. We would like to thank the entire Carnegie Mellon Sports Analytics Camp teaching team for their support and guidance during this project.

Citations

[1] Sammon, W., & Sarris, E. (2023, July 7). Fall of the slider: Why are hitters feasting on MLB’s once-deadly breaking ball? The Athletic. https://theathletic.com/4671150/2023/07/07/mlb-sliders-hitters-success/


  1. Ethan Park, University of Southern California

  2. Evan Wu, Elon University

  3. Priyanka Kaul, Harvard University